Clustering and Prediction of Rankings Within a Kemeny Distance Framework
Abstract
Rankings and partial rankings are ubiquitous in data analysis, yet there is relatively little work in the classification community that uses the typical properties of rankings. We review the broader literature that we are aware of, and identify a common building block for both prediction of rankings and clustering of rankings, which is also valid for partial rankings. This building block is the Kemeny distance, defined as the minimum number of interchanges of two adjacent elements required to transform one (partial) ranking into another. The Kemeny distance is equivalent to Kendall’s τ for complete rankings, but for partial rankings it is equivalent to Emond and Mason’s extension of τ. For clustering, we use the flexible class of methods proposed by Ben-Israel and Iyigun (Journal of Classification 25: 5–26, 2008), and define the disparity between a ranking and the center of a cluster as the Kemeny distance. For prediction, we build a prediction tree by recursive partitioning, and define the impurity measure of the subgroups formed as the sum of all within-node Kemeny distances. The median ranking characterizes subgroups in both cases.
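The three quantities named in the abstract (Kemeny distance, within-node impurity, median ranking) can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the paper: a ranking is stored as a tuple of rank positions with equal values denoting a tie; the distance uses the Kemeny-Snell score-matrix convention, in which a reversed pair counts 2 and a tie-versus-strict pair counts 1 (other normalizations differ by a constant factor); the impurity is read as the sum over all pairs of rankings in a node; and the median is found by brute force over complete rankings only, which is feasible only for a handful of items. The function names are hypothetical.

```python
from itertools import combinations, permutations


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def kemeny_distance(a, b):
    """Kemeny distance between two rankings given as rank positions.

    a[i] is the rank of item i (1 = best); equal values mean a tie.
    Assumed convention: reversed pair = 2, tie vs. strict order = 1.
    """
    if len(a) != len(b):
        raise ValueError("rankings must cover the same items")
    total = 0
    for i, j in combinations(range(len(a)), 2):
        # Compare the relative order of items i and j in both rankings.
        total += abs(sign(a[j] - a[i]) - sign(b[j] - b[i]))
    return total


def node_impurity(rankings):
    """Sum of Kemeny distances over all pairs of rankings in a node
    (one possible reading of 'sum of all within-node distances')."""
    return sum(kemeny_distance(r, s) for r, s in combinations(rankings, 2))


def median_ranking(rankings):
    """Brute-force median: the complete ranking minimizing the summed
    Kemeny distance to the given rankings. Candidates with ties are
    not searched here; that restriction is a simplifying assumption."""
    n = len(rankings[0])
    return min(
        permutations(range(1, n + 1)),
        key=lambda c: sum(kemeny_distance(c, r) for r in rankings),
    )


if __name__ == "__main__":
    node = [(1, 2, 3), (1, 2, 3), (2, 1, 3)]
    print(kemeny_distance((1, 2, 3), (1, 2, 2)))  # tie vs. strict order
    print(node_impurity(node))
    print(median_ranking(node))
```

In a recursive-partitioning setting, a candidate split would be scored by comparing `node_impurity` of the parent with the summed impurities of the children, and each resulting subgroup would be summarized by its `median_ranking`. The exhaustive search grows factorially with the number of items, so it serves only as an illustration.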
Similar resources
Computing Kemeny Rankings, Parameterized by the Average KT-Distance
The computation of Kemeny rankings is central to many applications in the context of rank aggregation. Unfortunately, the problem is NP-hard. Extending our previous work [AAIM 2008], we show that the Kemeny score of an election can be computed efficiently whenever the average pairwise distance between two input votes is not too large. In other words, Kemeny Score is fixed-parameter tractable wi...
Fixed-parameter algorithms for Kemeny rankings
The computation of Kemeny rankings is central to many applications in the context of rank aggregation. Given a set of permutations (votes) over a set of candidates, one searches for a “consensus permutation” that is “closest” to the given set of permutations. Unfortunately, the problem is NP-hard. We provide a broad study of the parameterized complexity for computing optimal Kemeny rankings. Be...
How similarity helps to efficiently compute Kemeny rankings
The computation of Kemeny rankings is central to many applications in the context of rank aggregation. Unfortunately, the problem is NP-hard. We show that the Kemeny score (and a corresponding Kemeny ranking) of an election can be computed efficiently whenever the average pairwise distance between two input votes is not too large. In other words, Kemeny Score is fixed-parameter tractable with r...
Semi-supervised composite kernel learning using distance metric learning techniques
Distance metric has a key role in many machine learning and computer vision algorithms so that choosing an appropriate distance metric has a direct effect on the performance of such algorithms. Recently, distance metric learning using labeled data or other available supervisory information has become a very active research area in machine learning applications. Studies in this area have shown t...
Aggregation of Rankings in Figure Skating
We scrutinize and compare, from the perspective of modern theory of social choice, two rules that have been used to rank competitors in Figure Skating for the past decades. The first rule has been in use at least from 1982 until 1998, when it was replaced by a new one. We also compare these two rules with the Borda and the Kemeny rules. The four rules are illustrated with examples and with the ...